University of Texas at Austin

Past Event: PhD Dissertation Defense

Adaptive and weighted optimization for efficient and robust learning

Yuege (Gail) Xie, CSEM PhD Student, Oden Institute, UT Austin

1 – 3PM
Friday Jul 29, 2022

Zoom Only

Abstract

Modern machine learning has driven significant breakthroughs in scientific and technological applications and prompted paradigm shifts in optimization and generalization theory. Adaptive and weighted optimization have become the workhorses behind today's machine learning applications, but there is still much to learn about why they work in practice and how to further improve their efficiency and robustness. In this thesis, we first establish the linear convergence of adaptive optimization and then analyze the generalization error of weighted optimization. With these theoretical results, we develop efficient and robust learning algorithms to tackle real-world problems such as model sparsification, image classification, and medical image segmentation.

To establish linear convergence guarantees for AdaGrad-Norm, an adaptive gradient descent algorithm, we develop a two-stage analysis framework and show that the convergence is robust to the initial learning rate. Unlike prior work, our analysis requires no knowledge of smoothness or strong-convexity parameters. To understand the generalization of weighted trigonometric interpolation, we derive exact expressions for the generalization error of both plain and weighted least squares estimators. We then show how a bias towards smooth interpolants can lead to smaller generalization error in the overparameterized regime.
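For a concrete picture, the AdaGrad-Norm update divides a base stepsize by a running accumulator of squared gradient norms, which is what makes it robust to the choice of initial learning rate. Below is a minimal sketch on a toy least-squares objective; the problem instance, base stepsize, and accumulator initialization are illustrative assumptions, not values from the dissertation.

```python
import numpy as np

# Minimal sketch of the AdaGrad-Norm update on a toy least-squares
# objective f(x) = 0.5 * ||Ax - b||^2. The problem instance, base
# stepsize, and accumulator initialization are illustrative
# assumptions, not values from the dissertation.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def grad(x):
    return A.T @ (A @ x - b)

x = np.zeros(5)
eta = 1.0     # base learning rate; the analysis shows robustness to this choice
acc = 1e-4    # accumulator initialization b_0^2

for t in range(500):
    g = grad(x)
    acc += g @ g                    # b_{t+1}^2 = b_t^2 + ||grad f(x_t)||^2
    x -= (eta / np.sqrt(acc)) * g   # x_{t+1} = x_t - (eta / b_{t+1}) * grad f(x_t)
```

The stepsize shrinks automatically as gradient information accumulates, which is the mechanism the two-stage analysis formalizes.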

For efficient sparse model learning, we propose SHRIMP (Sparser Random Feature Models via Iterative Magnitude Pruning) to adaptively fit high-dimensional data with inherent low-dimensional structure. SHRIMP outperforms other sparse random feature models at lower computational cost, while enabling feature selection and remaining robust to the choice of pruning rate.
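At a high level, iterative magnitude pruning alternates between fitting a random feature model and discarding the features with the smallest coefficients. The sketch below illustrates this loop; the Fourier-like features, ridge solver, 50% pruning rate, and iteration count are hypothetical choices for illustration, not the exact SHRIMP implementation.

```python
import numpy as np

# Hypothetical sketch of a SHRIMP-style loop: fit a random feature
# model, then alternate between magnitude pruning and refitting.
rng = np.random.default_rng(1)
n, d, m = 200, 10, 512                 # samples, input dim, random features
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # low-dimensional structure

W = rng.standard_normal((d, m))        # random feature directions
Phi = np.cos(X @ W)                    # random feature matrix

active = np.arange(m)                  # indices of surviving features
for it in range(5):
    P = Phi[:, active]
    # ridge regression restricted to the surviving features
    c = np.linalg.solve(P.T @ P + 1e-3 * np.eye(len(active)), P.T @ y)
    keep = np.argsort(np.abs(c))[len(c) // 2:]        # keep the larger half
    active = active[np.sort(keep)]

print(f"{len(active)} of {m} features survive after pruning")
```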

To further improve the computational efficiency and robustness of AdaGrad-Norm, we propose AdaLoss, an adaptive learning rate schedule that uses only the loss value instead of computing gradient norms. Building on AdaLoss, we develop a weighted learning rate schedule (AccLossWeight) for fitting models to data with noisy labels, which uses accumulated losses to learn a stepsize for each sample. Furthermore, we enhance data augmentation consistency regularization with an adaptively weighted schedule (AdaWAC) to handle volumetric medical image segmentation with both sparsely and densely labeled slices. We evaluate our method on CT and MRI scans and demonstrate superior performance over several baselines.
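The mechanism behind AdaLoss fits the same template as AdaGrad-Norm: the accumulator grows with the loss value, which the forward pass already provides, rather than with the gradient norm. The toy problem and hyperparameters below are again illustrative assumptions.

```python
import numpy as np

# Minimal sketch of an AdaLoss-style stepsize on a toy least-squares
# problem: the accumulator grows with the loss value (already
# available from the forward pass) instead of the gradient norm.
rng = np.random.default_rng(2)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def loss(x):
    r = A @ x - b
    return 0.5 * (r @ r)

def grad(x):
    return A.T @ (A @ x - b)

x = np.zeros(5)
eta, acc = 1.0, 1e-4
for t in range(500):
    acc += loss(x)                      # b_{t+1}^2 = b_t^2 + f(x_t)
    x -= (eta / np.sqrt(acc)) * grad(x)
```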

Biography

Yuege Xie is a CSEM PhD student at the Oden Institute, where she works with Prof. Rachel Ward at the Center for Scientific Machine Learning. Her research focuses on understanding optimization and generalization in machine learning and deep learning. Before joining UT Austin, she received her bachelor's degree in Mathematics and Applied Mathematics from Zhejiang University, China.



Event information

Date
1 – 3PM
Friday Jul 29, 2022
Location
Zoom Only
Hosted by
Rachel Ward